Abstract: We introduce a new technique, component-based
garbled  circuits,  for  increasing  the  efficiency  of  secure  two- 
party  computation  in  the  offline/online  semi-honest  setting.  We 
observe  that  real-world  functions  are  generally  constructed  in 
a  modular  way,  comprising  many  standard  components  for 
common  tasks  like  arithmetic  or  cryptographic  operations. 
Our  technique  allows  circuits  for  these  common  tasks  to  be 
garbled  and  shared  during  an  offline  phase;  once  the  function 
to  compute  is  specified,  these  pre-shared  components  can  be 
chained  together  to  create  a  larger  garbled  circuit.  We  stress 
that  we  do  not  assume  that  the  function  is  known  during  the 
offline  phase  —  only  that  it  uses  some  common,  predictable 
components. 

We  give  an  implementation,  CompGC,  of  this  technique 
and  measure  the  efficiency  gains  for  various  computations.  We 
compare  first  to  standard  garbled  circuit-based  secure  two- 
party  computation,  where  we  find  that  our  technique  results 
in  roughly  an  order  of  magnitude  performance  improvement. 
We  then  consider  a  set  of  machine  learning  classification 
computations  previously  studied  by  Bost  et  al.  (NDSS  2015) 
that  do  not  use  garbled  circuits.  We  find  that  our  component- 
based  technique  can  improve  online  performance  in  most  cases, 
including  an  order  of  magnitude  improvement  for  decision  tree 
classification. 

1.  Introduction 

Secure  two-party  computation  allows  a  pair  of  parties, 
each  with  private  input,  to  compute  a  function  of  those 
inputs  without  sharing  them  with  each  other.  This  is  an 
extremely powerful tool, and it was shown by Yao to be
feasible using an approach termed garbled circuits [Yao86].
Since  then,  a  long  line  of  work  has  aimed  to  increase  the 
efficiency  of  garbled  circuit-based  secure  computation.  This 
paper  continues  that  effort. 

In  particular,  our  goal  is  to  allow  the  use  of  offline 
pre-processing  to  significantly  reduce  online  computation 
time  for  garbled  circuit-based  computation.  This  is  not  a 
new  goal.  Beaver,  for  example,  showed  how  precomputation 
can  significantly  increase  the  online  speed  of  the  required 
oblivious transfers (OTs) [Bea95]. Others have found similar
ways to increase the online efficiency of the cut-and-choose
technique needed for malicious security [HKK+14], [LR14],
[LR15]. There is also a long history of precomputation in
the setting of non-garbled circuit-based two-party computation
[DPSZ12], [NNOB12].

In the semi-honest setting in which all of our constructions
work, it has long been known that precomputation
can  greatly  increase  efficiency  if  the  function  is  known 
ahead  of  time,  with  only  the  inputs  specified  at  the  time 
of  online  computation.  The  protocol  is  simple:  the  garbler 
computes  the  entire  garbled  circuit  ahead  of  time,  with 
only  OT  computations  (which  can  also  be  preprocessed,  but 
still  require  some  online  communication),  communication  of 
the  inputs,  and  evaluation  done  online.  However,  requiring 
that  the  function  be  known  ahead  of  time  is  a  substantial 
limitation. 

In  this  work,  we  show  a  way  to  achieve  a  similar  benefit 
without  prior  knowledge  of  what  circuit  will  be  computed. 
Towards  this  goal,  we  note  that  most  functions  of  interest 
are  built  in  a  modular  way.  Just  as  a  programmer  writes  code 
for  a  complex  function  by  using  existing  simpler  functions, 
the  circuits  for  these  functions  use  components  that  perform 
common  tasks.  There  might  be  a  portion  of  the  circuit  that 
takes  the  maximum  of  two  numbers,  for  example,  or  that 
computes  a  hash  function.  We  show  that  one  can  precompute 
garbled  circuits  for  these  smaller  components  and  then  chain 
them  together  in  the  online  phase  when  the  function  to  be 
computed  is  specified.  We  call  this  component-based  garbled 
circuit  construction.  We  show  cryptographic  protocols  for 
carrying it out, and we provide an open-source implementation,
CompGC, that achieves large efficiency gains, upwards
of an order of magnitude improvement in online computation
time, versus standard garbled circuit-based secure two-
party computation.

We can imagine this system being used in several different
ways. In the narrowest case, parties may know roughly
what  sort  of  function  will  be  computed.  For  example,  they 
might  be  unsure  only  of  the  input  length.  In  this  case, 
they  can  compute  a  narrow  set  of  components  specifically 
tailored  to  that  function.  This  incurs  slightly  greater  total 
computation  cost  in  exchange  for  greatly  improved  online 
speed. 

In a more general setting, parties might engage frequently
in computation of a given general type. A library of
common operations might be developed for that particular
type of computation. Cryptographic functions, for example,
commonly rely on a small set of operations, including large
components like those for computing standard hash functions
and blockciphers and smaller components for simple
tasks like bitwise XOR of two strings. Geometric computations,
on the other hand, might require a large number
of  matrix  operations.  Other  libraries  could  be  developed 
for  computations  in  machine  learning,  finance,  or  other 
general  areas,  or  specifically  tuned  to  the  needs  of  a  larger 
application  of  which  the  secure  computation  was  part. 

Finally,  in  the  most  general  setting,  parties  engaging  in 
a  great  deal  of  computation  over  time  could  compute  an 
enormous library with a huge number of possible component
types. This would allow extraordinarily fast (online)
computation  of  a  wide  array  of  functions. 

We  note  that  in  the  last  two  use  cases  discussed  above, 
substantial  storage  would  be  required.  There  would  also  be 
significant  setup  cost.  However,  components  in  our  scheme 
that  are  not  used  for  one  computation  can  be  saved  for  the 
next.  That  means  that  the  component  library  that  the  parties 
have  precomputed  can  be  maintained  simply  by  replacing 
used  components.  As  a  result,  the  amortized  total  cost  of 
each  computation  is  not  greatly  increased,  and  latency  is 
drastically reduced. We also allow load balancing, since parties
can replace used components whenever computational
resources  are  available. 

1.1.  Our  Contributions 

Our contributions go well beyond pointing out the ability
to divide a circuit into pieces. We give formal specifications
for how to create and connect components. We also
give a practical, open-source implementation, CompGC, and
show experimentally that our method allows for drastically
reduced online computation. Specifically, we make the following
contributions.

Component-based  garbled  circuits.  We  give  a  protocol  for 
precomputing  garbled  circuits  for  given  components,  and 
for  combining  these  components  as  needed  at  runtime.  We 
show  that  security  is  maintained  by  this  protocol.  This 
construction  allows  arbitrary  linkage  between  component 
wires  while  requiring  online  communication  of  only  one 
label per component input wire. We note that this technique
is very similar to the “partial garbled circuits” of Mood et
al. [MGBF14], although it was used for a different purpose
in that work and, as described, required two labels per
connection, whereas we only need a single label per connection.
Additionally, a long line of work [NQ09], [FJN+13],
[FJNT15] building on the so-called “LEGO” approach to
maliciously secure garbled circuits uses essentially the same
technique to solder garbled circuits out of individual pre-garbled
NAND gates. None of these three papers give an
implementation  or  experimental  evaluation  to  demonstrate 
the  practical  benefit  of  this  technique  for  real  applications. 

CompGC  implementation.  We  develop  our  own  standalone 
library  libgarble  for  garbling  circuits.  Our  library  is  based  on 
the JustGarble implementation of Bellare et al. [BHKR13],
but makes many internal improvements to the codebase.
None of these changes improves the underlying algorithm;
rather, they are optimizations
of the code. For example, we revise the data structure by
which circuits are stored in order to speed access to certain
data. We believe this is a valuable contribution on its own,
and it is relevant even when not using our component-based
precomputation strategy. Our library improves the performance
of garbling and evaluating an AES circuit by 10%
and 22%, respectively, as compared to JustGarble, along
with many other improvements, including support for half-gates
[ZRE15] and privacy-free garbled circuits [FNO15]
alongside a consistent API.

We  then  use  libgarble  as  a  building  block  to  create  a 
complete  secure  computation  system,  CompGC.  This  tool 
allows parties to precompute any specified library of components
during the offline phase, using libgarble to garble each
component.  During  the  online  phase,  it  creates  a  series  of 
instructions  for  the  evaluator  that  allows  the  chaining  of  the 
relevant  components,  and  it  handles  the  extra  computation 
(outside  of  garbling  and  evaluation)  that  is  required  to 
distribute  the  input  wire  labels  and  decipher  the  output  wire 
labels. 

Experimental  results.  We  use  this  implementation  to  conduct 
several experiments. We first compare to prior work on garbled
circuits. Specifically, we consider three computations:
(1) computing AES using a single-round AES component as
a building block; (2) using this single-round AES component
to  allow  for  encryption  of  arbitrary  length  messages  using 
CBC  mode;  and  (3)  computing  Levenshtein  distance,  which 
can  be  used  for  any  number  of  applications,  including 
text  processing  and  genetic  analysis.  Computations  (2)  and 
(3)  above  model  a  setting  where  only  the  input  length  is 
unknown  during  precomputation.  Computations  (1)  and  (2) 
also model a setting where a library of standard cryptographic
components is used.

We  measure  total  online  time  required  to  perform  the 
secure  computation  over  both  localhost  and  a  simulated 
realistic network configuration. In all of these measurements,
we see substantial efficiency improvements due to
precomputation. For example, when computing Levenshtein
distance between two 60-symbol strings, where each symbol
comes from an 8-bit alphabet, we see a greater than order
of magnitude improvement (from 10.6 seconds to 752 milliseconds)
when using our approach over the naive approach
of  sending  the  circuit  online. 

Next, we consider a broad class of machine learning
classification computations first considered by Bost et
al. [BPTG15]. This is an ideal setting for a library that is
limited and tailored to a specific application domain, but
which can nevertheless carry out a substantial variety of
computations. We show, using the observations of Bost et
al., that a library with only a small number of simple components
is sufficient to allow very fast online computation of
many classification functions. We then show that by using
garbled circuits together with our component-based techniques
we can in most cases improve online performance
compared to Bost et al., including a drastic improvement for
decision tree classifiers.

All of our work is done in the semi-honest model. We
believe there are many use cases of secure computation for
which semi-honest security is sufficient. For example, two
mutually trusting companies or agencies may be prevented
from sharing data by policy or legal restrictions, but otherwise
trust each other to behave honestly. We also view semi-
honest  security  as  a  natural  stepping  stone,  and  we  expect 
these  techniques  can,  with  additional  work,  be  extended  to 
the  malicious  setting  as  well. 


1.2.  Paper  Organization 


The remainder of this paper is organized as follows. Section 2
summarizes the related prior work. Section 3 provides
background information on garbled circuits and secure two-
party computation, introducing the necessary notation that
we use in the remainder of the paper. Section 4 describes
our component-based garbled circuit technique, Section 5
provides the details on our prototype implementation of the
described primitives, and Section 6 gives the experimental results
evaluating the performance of our schemes for several
common classes of functions. We conclude in Section 7.


2.  Related  Work 


Garbled circuits were first introduced by Yao in the
1980s [Yao86] as a tool for general secure two-party computation.
While they were originally viewed mainly as a theoretical
tool, this view has changed significantly over the past
decade or so. Starting with the Fairplay system of Malkhi
et al. [MNPS04], garbled circuits have been built into prototypes
of secure computation. This has led to a long line
of work (e.g., [BHKR13], [HKS+10], [HEKM11], [KsS12],
[LR15], [Mal11], [MGBF14], [PSSW09], [SHS+15]) that
aims to improve the efficiency of garbled circuits and to
build usable and practical systems for various real-world
applications. Out of this work, the most efficient known
implementations (not using specialized massively-parallel
hardware [KsS12]) of general garbled circuit-based computation
are TinyGarble [SHS+15] for security against semi-honest
adversaries, which is based on the efficient garbling
procedure introduced by JustGarble [BHKR13], and the
“Blazing Fast 2PC” system [LR15] for malicious adversaries
(in the offline/online model).

One method for increasing the efficiency of garbled
circuit-based secure computation is to work in the offline/online
model and use preprocessing to reduce the online
running time. A substantial line of work has focused on
reducing the cost of the cut-and-choose technique [LP07] for
malicious security using preprocessing [HKK+14], [LR14],
[LR15]. However, all of these works require that the function
to compute be defined during the pre-processing phase. Our
goal  is  to  allow  the  benefits  of  pre-processing  even  when 
one  knows  little  about  the  function  that  might  be  computed. 

In attempting to increase the online efficiency of secure
computation, we are guided by many prior works that
identified as a major bottleneck the time and bandwidth
necessary to transmit the garbled circuit to the evaluator.
Several works [KMR14], [KS08], [NPS99], [PSSW09],
[ZRE15] aim to reduce the size of the circuit that must be
communicated between the generator and evaluator. We see
this  paper  as  continuing  this  effort,  aiming  to  reduce  the 
amount  of  communication  necessary  in  the  online  phase  of 
garbled  circuit  evaluation.  While  we  do  not  further  reduce 
the  overall  size  of  the  garbled  circuit  to  be  transmitted,  we 
significantly  reduce  the  amount  of  communication  necessary 
in  the  online  phase,  after  the  function  to  compute  and  the 
inputs  are  chosen. 

As  communication  is  the  main  bottleneck,  Gueron  et 
al. [GLNP15] argue that the speed improvements made by
JustGarble  disappear  due  to  the  need  to  transmit  the  circuit. 
Because  we  send  the  circuit  components  in  the  offline  phase, 
communication  is  no  longer  the  bottleneck  and  we  can  thus 
reap  all  the  performance  benefits  of  using  a  JustGarble- 
based  garbling  library. 

The idea of breaking circuits into smaller pieces appeared
previously in the work of Mood et al. [MGBF14],
where it was called “partial garbled circuits”. Rather than
use it to reduce online computation and communication
time as we do here, Mood et al. used it as a way to reuse
values in internal gates of a garbled circuit across multiple
computations. Their technique also requires sending two
correction labels per wire, whereas we can do it with just
one. Additionally, several prior works using the “LEGO”
approach to building garbled circuits [NQ09], [FJN+13],
[FJNT15] use this idea to assemble circuits out of pre-garbled
NAND gates.

Secure computation of machine learning classifiers. A classifier
(or model) is the output of a machine learning technique
when applied to a training data set. The resulting classifier
is capable of labeling new data points. For example,
a  hospital  might  have  a  classifier  that  can  indicate  whether 
a  particular  patient’s  medical  record  is  indicative  of  a  given 
disease.  Because  that  classifier  was  constructed  from  private 
patient  data  (or  in  other  settings,  because  it  is  proprietary) 
it  cannot  be  released  publicly.  Similarly,  a  patient  might 
not  want  to  freely  transmit  their  medical  records.  Securely 
computing  classifier  functions  allows  the  patient  to  learn 
the  result  of  the  classifier  without  either  the  model  or  the 
patient’s  medical  records  being  shared. 

There are several works (e.g., [BFK+09], [EFG+09])
that show how to securely compute specific classifiers, but
the only work that shows how to compute a broad set of
classifiers is that of Bost et al. [BPTG15]. Bost et al. show
that three component operations (comparison, argmax, and
dot product) are sufficient for computing hyperplane decision,
naive Bayes, and decision tree classifiers. Because
many different machine learning algorithms use classifiers
of one of these three types, this means that three component
operations are sufficient for carrying out secure classification
using a wide range of underlying machine learning algorithms.

Bost et al. show tailored protocols for secure computation
of these components and therefore of the classifiers,
which  they  claim  to  be  much  more  efficient  than  “general” 
garbled  circuit-based  computation. 




3.  Preliminaries 

In  this  section  we  briefly  introduce  the  notation  and  key 
primitives  that  we  use,  as  well  as  some  background. 

3.1.  Garbled  Circuits 
Garbled circuits are the main tool used for all of our constructions.
Our presentation here follows [GLNP15], [LR14],
which is adapted from [BHR12], and we refer the reader to
those works for a more detailed presentation.

Garbled circuits, proposed originally by Yao [Yao86],
are a way of encoding a Boolean circuit that allows for secure
evaluation  of  the  function  computed  by  that  circuit.  This 
encoding  has  the  property  that  given  encodings  of  values 
for  each  input  wire,  it  is  possible  to  evaluate  the  function 
computed  by  this  circuit  (i.e.,  learn  the  values  of  the  output 
wires)  without  learning  the  values  of  the  input  wires  or  any 
of  the  internal  circuit  wires.  This  enables  two-party  secure 
computation  where  one  party  produces  the  garbled  circuit 
and  the  input  labels,  and  the  other  party  evaluates  the  circuit 
to produce the output. This is described in more detail in
Section 3.3.

More formally, a garbling scheme consists of two algorithms
(Garble, Eval). On input a security parameter $1^\kappa$
and a circuit $C$, $\mathsf{Garble}(1^\kappa, C)$ returns the triple $(GC, e, d)$,
where $GC$ is the garbled circuit, $e$ is the ordered set of input
wire labels $\{(W_i^0, W_i^1)\}_{i \in \mathsf{Inputs}(C)}$, and $d$ is the ordered set
of output labels $\{(W_i^0, W_i^1)\}_{i \in \mathsf{Outputs}(C)}$.

Given a garbled circuit $GC$ and a set of input labels
$X = \{W_i^{x_i}\}_{i \in \mathsf{Inputs}(C)}$, $\mathsf{Eval}(GC, X)$ computes the garbled
output $Z$ such that, using the set $d$, it is possible to
recover the actual output $z$ (i.e., by finding $Z$ in the ordered
set of output labels).


Example. The most straightforward example of a garbled
circuit is Yao’s original scheme. Each wire $W_i$ has two
associated labels, $W_i^0$ and $W_i^1$, corresponding to values 0
and 1 respectively. For each gate there is a table like Table 1.
This table contains encryptions of the labels for the gate’s
output  wire,  using  the  labels  of  the  input  wires  as  keys.  The 
encryptions  are  chosen  so  that  the  evaluator,  knowing  the 
labels  of  the  two  input  wires,  can  decrypt  the  proper  label 
of  the  output  wire  (and  nothing  else).  Repeated  evaluation  of 
gates  then  propagates  knowledge  of  the  correct  wire  labels 
(for  whatever  initial  input  labels  were  given)  through  the 
entire  circuit. 
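
To make the table-based construction concrete, the following Python sketch garbles and evaluates a single AND gate in the style of Yao's scheme. It is an illustration under assumptions that do not come from this paper: the "encryption" is a one-time pad derived with SHA-256 from the two input labels, a block of zero bytes is appended so the evaluator can recognize the row that decrypts correctly, and the names garble_and and eval_gate are hypothetical. It is not the procedure used by libgarble.

```python
import hashlib
import secrets

KAPPA = 16          # label length in bytes (128 bits); an illustrative choice
PAD = bytes(KAPPA)  # all-zero tag that marks a successful decryption


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def enc(key_a: bytes, key_b: bytes, label: bytes) -> bytes:
    # Pad derived from both input labels (32 bytes = label || tag).
    pad = hashlib.sha256(key_a + key_b).digest()
    return xor(pad, label + PAD)


def garble_and():
    # Two random labels per wire: index 0 encodes bit 0, index 1 encodes bit 1.
    wa, wb, wc = ([secrets.token_bytes(KAPPA) for _ in range(2)]
                  for _ in range(3))
    # One row per input combination, encrypting the matching output label.
    table = [enc(wa[a], wb[b], wc[a & b]) for a in (0, 1) for b in (0, 1)]
    secrets.SystemRandom().shuffle(table)  # hide which row is which
    return table, wa, wb, wc


def eval_gate(table, label_a: bytes, label_b: bytes) -> bytes:
    # The evaluator holds exactly one label per input wire and learns
    # only the output label that those two labels unlock.
    pad = hashlib.sha256(label_a + label_b).digest()
    for row in table:
        plain = xor(pad, row)
        if plain[KAPPA:] == PAD:
            return plain[:KAPPA]
    raise ValueError("no row decrypted with the given labels")
```

In a multi-gate circuit one would also bind a gate identifier into the hash input; that detail is omitted here to keep the sketch short.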


Privacy. In order to be useful for secure two-party computation,
it is necessary that garbled circuits satisfy the
following privacy notion. The values seen by the evaluator,
$GC$, $d$, and $X$, should not reveal any information about
$x$ that is not revealed by the output $C(x)$. Formally, we
require that there exist a polynomial time simulator $\mathcal{S}$ that on
input $(1^\kappa, C, C(x))$ outputs a simulated garbled circuit that
is indistinguishable from $(GC, e, d)$ generated by Garble.
Since  S  knows  C(x)  but  not  x,  this  captures  the  fact  that 
the  output  of  Garble  does  not  reveal  anything  (else)  about 
x. 


TABLE 1: Garbled AND Gate. Only the values in the last
column are sent to the evaluator. If the input wires have
values $a$ and $b$, then the evaluator knows $W_0^a$ and $W_1^b$ and
can therefore decrypt $W_2^{ab}$.


Free-XOR. Our constructions make use of one critical
improvement to the original garbled circuits called free-XOR
[KS08], which allows for XOR gates to be evaluated
“for free” without requiring any garbled tables to be
included in the garbled circuit. Specifically, this technique
works by choosing a global random value $R$ and then
ensuring that the labels for all circuit wires have a difference
of $R$. That is, for any wire $W_i$, $W_i^0 \oplus W_i^1 = R$. This enables
the  secure  evaluation  of  an  XOR  gate  by  simply  computing 
the  XOR  of  the  two  incoming  labels,  as  R  cancels  out 
appropriately. 
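
The cancellation of $R$ can be checked directly. The short Python sketch below is illustrative only (the helper names fresh_wire, garble_xor, and eval_xor are hypothetical): it samples a global offset, derives each wire's one-label as the zero-label XOR the offset, and evaluates an XOR gate by XORing the two labels the evaluator holds.

```python
import secrets

KAPPA = 16  # label length in bytes; an illustrative choice


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def fresh_wire(offset: bytes):
    # (zero label, one label) with one = zero XOR offset.
    zero = secrets.token_bytes(KAPPA)
    return (zero, xor(zero, offset))


def garble_xor(wire_a, wire_b, offset: bytes):
    # No garbled table: the output zero label is the XOR of the input
    # zero labels, and the output one label again differs by the offset.
    zero = xor(wire_a[0], wire_b[0])
    return (zero, xor(zero, offset))


def eval_xor(label_a: bytes, label_b: bytes) -> bytes:
    # The evaluator just XORs whatever labels it holds.
    return xor(label_a, label_b)
```

For input bits $a$ and $b$, the evaluator computes $(W_a^0 \oplus aR) \oplus (W_b^0 \oplus bR) = W_c^0 \oplus (a \oplus b)R$, which is exactly the output label for $a \oplus b$.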

3.2.  Oblivious  Transfer 

Another key component for secure two-party computation
is oblivious transfer (OT) [EGL82], [Rab05]. OT is
a two-party primitive where one party (the sender) has as
input two $\kappa$-bit strings $(m_0, m_1)$ and the other party (the
receiver) has a bit $b$. OT enables the receiver to receive $m_b$
from the sender, while preventing the sender from learning
which string was received (the value of $b$) and preventing
the receiver from learning anything about $m_{1-b}$. In this
paper we use the semi-honest OT construction by Naor and
Pinkas [NP99].

One technique for optimizing OT that we make critical
use of is OT preprocessing [Bea95]. OT preprocessing
allows  splitting  any  OT  protocol  into  an  expensive  offline 
phase  and  a  much  cheaper  online  phase.  Specifically,  in  the 
offline  phase,  before  the  inputs  are  known,  OT  is  performed 
on  random  inputs  for  both  the  sender  and  receiver.  This 
requires  a  number  of  expensive  cryptographic  operations. 
However,  in  the  online  phase  the  pre-OT’d  values  are  used  to 
perform  the  OT  on  the  parties’  actual  inputs  without  needing 
any  additional  expensive  operations. 
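
The online step of OT preprocessing can be sketched as follows. This is a minimal illustration of the standard derandomization usually credited to Beaver, not code from CompGC: the expensive offline OT is replaced by a trusted-dealer stand-in (offline_random_ot, a hypothetical name) so that only the cheap online XOR operations are shown, and all function names are assumptions.

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def offline_random_ot(kappa_bytes: int = 16):
    # Stand-in for a real OT run on random inputs. A real protocol would
    # perform public-key operations here; this only produces the outputs.
    r = (secrets.token_bytes(kappa_bytes), secrets.token_bytes(kappa_bytes))
    c = secrets.randbelow(2)
    return r, (c, r[c])  # sender state (r0, r1), receiver state (c, r_c)


def receiver_request(b: int, c: int) -> int:
    # d = b XOR c hides the real choice b because c is uniformly random.
    return b ^ c


def sender_respond(m0: bytes, m1: bytes, r, d: int):
    # Mask each message with the pre-shared random string selected by d.
    return xor(m0, r[d]), xor(m1, r[1 ^ d])


def receiver_recover(b: int, r_c: bytes, y) -> bytes:
    # y[b] = m_b XOR r_{b XOR d} = m_b XOR r_c, so r_c unmasks it.
    return xor(y[b], r_c)
```

The online phase exchanges one bit and two masked strings and uses only XOR, which is why it is much cheaper than running the OT protocol itself.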

3.3.  Secure  Two-party  Computation 

We now briefly describe how garbled circuits and oblivious
transfer can be used to realize secure two-party computation,
that is, to enable two parties to compute a joint
function on their inputs without either party learning more
than what is implied by its input and output. In this work we
focus on two-party computation that is secure against a semi-honest
adversary corrupting either of the two parties. That
is, such an adversary follows the protocol as specified, but
attempts to learn extra information from its interactions. For
a formal treatment of the security of two-party computation
we refer readers to the book by Goldreich [Gol09].

In garbled circuit-based two-party computation of circuit
$C$, we identify the two parties as the garbler, who has input
$x$, and the evaluator, who has input $y$. The garbler first runs
$\mathsf{Garble}(C)$ to produce $(GC, e, d)$. He then sends $GC$ and
an encrypted form of $d$ to the evaluator together with the
wire labels corresponding to the bits of the garbler’s input
$x$. The encrypted form $D$ of $d$ corresponds to a random
permutation of $\{\mathsf{Enc}_{W_i^0}(0), \mathsf{Enc}_{W_i^1}(1)\}_{i \in \mathsf{Outputs}(C)}$.

Now,  for  each  bit  of  the  evaluator’s  input  y,  the  garbler 
and  evaluator  run  an  OT  protocol  by  which  the  evaluator 
learns  the  appropriate  wire  label  (without  revealing  that  bit 
of y to the garbler). The evaluator now has all the inputs needed to
run Eval(GC, X) and recover the output wire labels. It then
uses  these  wire  labels  to  decrypt  the  entries  in  D  to  learn 
the  appropriate  output.  If  output  by  both  parties  is  desired, 
the  evaluator  can  send  this  output  to  the  garbler. 


4.  Component-Based  Garbled  Circuits 


We  introduce  the  concept  of  component-based  garbled 
circuits  to  allow  for  much  of  the  work  involved  in  building 
and  transmitting  a  garbled  circuit  to  be  done  in  an  offline 
phase  before  the  inputs  or  even  the  function  to  compute  are 
known.  This  allows  us  to  significantly  improve  the  online 
performance  of  secure  two-party  computation  schemes  using 
garbled circuits. Our improvements stem from the observation
that a common way to build circuits (and programs)
is  to  compose  them  out  of  common  building  blocks  or 
components.  For  example,  common  components  such  as 
circuits  for  arithmetic  operations,  cryptographic  functions, 
and  text  processing  can  form  the  base  for  large  classes  of 
general  computation. 

We show how to take advantage of such common components
for designing efficient garbled circuits. Specifically,
our  approach  is  to  pre-garble  a  large  number  of  common 
component  circuits  in  an  offline  phase.  Note  that  we  do 
not  need  to  know  the  computation  to  be  performed  (besides 
the  generic  components  used  to  create  said  computation)  or 
the  inputs  during  this  offline  phase.  Then,  in  an  efficient 
online  phase,  we  show  how  to  link  these  components  to 
form  the  actual  circuit  we  wish  to  compute.  We  only  need 
to  send  a  single  wire  label  for  each  of  the  input  wires  in 
each component. Even if components are all single gates,
this corresponds to sending only one label per wire,
which is half the size of the best known garbled circuit
construction [ZRE15]. However, components will rarely be a
single gate. We believe that in many applications (including
those used in our experiments) circuits will use many large
components, and all wires internal to a given component
require no communication at all. Since the time to communicate
the garbled circuit is the major bottleneck, this
leads to significant savings in the overall garbled circuit
computation; see Section 6 for details.
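
The communication argument above can be turned into a back-of-the-envelope calculation. The Python sketch below rests on assumptions that are not measurements from this paper: 128-bit labels, the half-gates cost of two ciphertexts per AND gate and none per XOR gate, and a hypothetical component with 256 linked input wires and 10,000 AND gates. It ignores the evaluator's input labels, OT traffic, and the wiring specification.

```python
KAPPA_BYTES = 16  # assumed label/ciphertext size (128 bits)


def online_bytes_standard(num_and_gates: int) -> int:
    # Sending the garbled circuit online with half-gates:
    # two ciphertexts per AND gate, nothing for XOR gates.
    return 2 * KAPPA_BYTES * num_and_gates


def online_bytes_component(num_linked_input_wires: int) -> int:
    # Component-based: one link label per linked component input wire;
    # wires internal to the component cost nothing online.
    return KAPPA_BYTES * num_linked_input_wires


# Hypothetical component: 256 linked input wires, 10,000 AND gates.
standard = online_bytes_standard(10_000)   # 320,000 bytes
component = online_bytes_component(256)    # 4,096 bytes
```

Under these assumed numbers the online traffic for the component drops by a factor of roughly 78, and the gap widens as components get larger relative to their input width.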


More technically, a component-based garbling scheme
is a triple of algorithms (Garble, Link, Eval). Garble
and Eval are variants on the corresponding methods for
standard garbled circuits, while Link is new.

Garble. The Garble procedure is unchanged, but now is
given a component $c$ as input (in place of a complete circuit
$C$). $\mathsf{Garble}(c)$ outputs the garbled component $GC_c$, input
wire set $e_c$, and output wire set $d_c$ for this component.

Link. On input two garbled components $c_0 = (GC_0, e_0, d_0)$
and $c_1 = (GC_1, e_1, d_1)$ as well as a mapping of output wires
of $c_0$ to input wires of $c_1$, Link produces the link labels
needed to convert from $c_0$ output wires to $c_1$ input wires.
Specifically, suppose that output wire $W_i$ of $c_0$ has labels
$(W_i^0, W_i^1)$ and input wire $W_j$ of $c_1$ has labels $(W_j^0, W_j^1)$.
Then, to link these two wires, Link outputs $W_{ij} = W_i^0 \oplus W_j^0$.
Note that since we use the free-XOR optimization, we
know that both $W_i^0 = W_i^1 \oplus R$ and $W_j^0 = W_j^1 \oplus R$ for
some random value $R$. Therefore, we have that $W_i^0 \oplus W_j^0 =
W_i^1 \oplus W_j^1$, so a single label $W_{ij}$ is sufficient to connect both
the zero and the one wire labels. This allows us to reduce
the communication necessary to one label per component
wire (together with a specification of which wire to link to
which wire).
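
The single-label property of Link is easy to verify numerically. The following Python sketch is an illustration under the free-XOR assumption stated above (both wires share the same offset $R$); the helper names fresh_wire, link, and translate are hypothetical and are not taken from the CompGC code.

```python
import secrets

KAPPA = 16  # label length in bytes; an illustrative choice


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def fresh_wire(offset: bytes):
    # (zero label, one label) with one = zero XOR offset, as in free-XOR.
    zero = secrets.token_bytes(KAPPA)
    return (zero, xor(zero, offset))


def link(out_wire, in_wire) -> bytes:
    # Garbler side: W_ij = W_i^0 XOR W_j^0, sent in the online phase.
    return xor(out_wire[0], in_wire[0])


def translate(out_label: bytes, link_label: bytes) -> bytes:
    # Evaluator side: turn the held output label of one component into
    # the matching input label of the next, without learning the bit.
    return xor(out_label, link_label)
```

Because both wires differ by the same $R$, translate maps the zero label to the zero label and the one label to the one label, so the evaluator never needs to know which of the two it holds.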


Eval. On input a list of garbled components $\{c_i\}$ and
linking labels $\{W_{ij}\}$, Eval computes the garbled outputs
$\{Y_i\}$ as follows. Starting from the inputs, Eval proceeds
component by component, evaluating each component to
get the component output wire labels. When appropriate, it
uses these component output wire labels together with the
appropriate link labels to recover the input labels for later
components. Finally, once all the components are evaluated,
Eval recovers the garbled outputs $\{Y_i\}$ from the output
components and uses $d$ for that component to recover the
(real) output $y$.
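
The component-by-component evaluation described above can be sketched end to end on a toy example. The Python code below is an assumption-laden illustration, not the CompGC implementation: each "component" is a single AND gate garbled with a SHA-256-derived pad and a zero-byte tag, all wires share one free-XOR offset, the garbler and evaluator are simulated in one function, and every name is hypothetical. It chains two pre-garbled components to compute (x0 AND x1) AND x2.

```python
import hashlib
import secrets

KAPPA = 16
PAD = bytes(KAPPA)  # zero tag marking the row that decrypts correctly


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def wire(offset: bytes):
    zero = secrets.token_bytes(KAPPA)
    return (zero, xor(zero, offset))


def garble_and_component(offset: bytes):
    # Offline: garble one AND-gate component with free-XOR-style labels.
    a, b, out = wire(offset), wire(offset), wire(offset)
    table = [xor(hashlib.sha256(a[i] + b[j]).digest(), out[i & j] + PAD)
             for i in (0, 1) for j in (0, 1)]
    secrets.SystemRandom().shuffle(table)
    return {"table": table, "e": (a, b), "d": out}


def eval_component(table, label_a: bytes, label_b: bytes) -> bytes:
    pad = hashlib.sha256(label_a + label_b).digest()
    for row in table:
        plain = xor(pad, row)
        if plain[KAPPA:] == PAD:
            return plain[:KAPPA]
    raise ValueError("no row decrypted with the given labels")


def run(x0: int, x1: int, x2: int) -> int:
    offset = secrets.token_bytes(KAPPA)
    # Offline phase: both components are garbled before the function
    # (how they will be wired together) is known.
    c0, c1 = garble_and_component(offset), garble_and_component(offset)
    # Online phase, garbler: one link label for the single connection.
    w01 = xor(c0["d"][0], c1["e"][0][0])
    # Online phase, evaluator: evaluate c0, translate, evaluate c1.
    y0 = eval_component(c0["table"], c0["e"][0][x0], c0["e"][1][x1])
    linked = xor(y0, w01)
    y1 = eval_component(c1["table"], linked, c1["e"][1][x2])
    # Decode with the output map d of the last component.
    return c1["d"].index(y1)
```

In a real deployment the input labels for the evaluator's bits would arrive via OT and the output map would be sent in the encrypted form described in Section 3.3; both are collapsed here for brevity.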

For details on the exact garbling scheme used to garble
the components, the format for indicating which wires to
link, and several further optimizations, we
refer the reader to the implementation details in Section 5.


Privacy. We now show how to adapt the standard privacy definition for garbled
circuits [BHR12] to the component-based setting. Specifically, for a set of
components $\{c_i\}_{i \in \mathrm{Components}}$, we want that the pre-garbled
components $\{GC_i\}$, together with the input labels
$\{W_j^{x_j}\}_{j \in \mathrm{Inputs}(C)}$, the output map
$d_{\mathrm{output}(C)}$, and all the link labels
$\{W_{ij}\}_{i,j \in \mathrm{Components}}$, do not reveal any information about
$x$. Formally, as in the case of garbled circuits, we require that there exist
a polynomial time simulator $S$ that, on input $(1^\kappa, C, C(x))$, where
$C(\cdot)$ is some polynomial size circuit, outputs simulated garbled
components for all components in $C$, input and output labels, and all the
linking labels $W_{ij}$ for linking all necessary wires, such that these are
indistinguishable from
$(\{GC_i\}_i, e_{\mathrm{input}(C)}, d_{\mathrm{output}(C)})$ and $W_{ij}$
generated by the real Garble and Link procedures. Security is captured by the
following game:

The privacy experiment $\mathsf{Expt}^{\mathsf{priv}}_{A,S}(\kappa)$:

1) Invoke adversary $A$: compute $(C, x) \leftarrow A(1^\kappa)$.

2) Choose a random $b \leftarrow \{0, 1\}$.

3) If $b = 0$: For each component $c_i$ in $C$, compute
   $(GC_i, e_i, d_i) \leftarrow \mathrm{Garble}(1^\kappa, c_i)$. Additionally,
   for each pair of components $(c_i, c_j)$ that need to be linked, compute
   all the link labels $\{W_{ij}\} \leftarrow \mathrm{Link}(c_i, c_j)$.
   Finally, compute input labels
   $X = \{W_j^{x_j}\}_{j \in \mathrm{Inputs}(C)}$ and output map
   $d_{\mathrm{output}(C)}$. Then output challenge
   $\tau = (\{GC_i\}, \{W_{ij}\}, X, d_{\mathrm{output}(C)})$.

   If $b = 1$: Compute $\tau \leftarrow S(1^\kappa, C, C(x))$.

4) Give $A$ the challenge $\tau$ and obtain a guess $b' \leftarrow A(\tau)$.

5) Output 1 if and only if $b' = b$.


Definition 1. A component-based garbled circuit scheme achieves privacy if for
every probabilistic polynomial time $A$ there exists a probabilistic
polynomial time simulator $S$ and a negligible function $\mu(\cdot)$ such that
for every $\kappa \in \mathbb{N}$:

$$\Pr\left[\mathsf{Expt}^{\mathsf{priv}}_{A,S}(\kappa) = 1\right] \le \frac{1}{2} + \mu(\kappa)$$


4.1.  Component-Based  Secure  Two-Party  Computation


We now briefly describe how to use component-based garbled circuits for secure
two-party computation. In an offline stage, before inputs or even the
computation to be performed are known, the garbler runs Garble on a number of
components to pre-garble these components; it then sends
$\{GC_i\}_{i \in \mathrm{Components}}$ and an encrypted form $D$ of
$d_{\mathrm{output}(C)}$ (as specified in Section 3.3) to the evaluator. These
components are circuit building blocks that comprise the eventual computation;
however, their exact linking is not determined at this time. In parallel, the
garbler and evaluator preprocess a number of instances of OT. Both the garbler
and the evaluator store the received garbled components and preprocessed OTs.

When the function $f$ to compute and the inputs $(x, y)$ are known, the
garbler assembles the circuit $C$ out of the garbled components $\{c_i\}$. For
each component pair that needs to be linked, the garbler runs
Link$(c_i, c_j)$ and sends the link labels $W_{ij}$, along with the indices of
the wires to be linked, to the evaluator. Additionally, the garbler sends the
input labels $\{W_j^{x_j}\}$ for the garbler’s inputs. Finally, the garbler
and evaluator complete the online phase of the OTs to retrieve the labels
$\{W_j^{y_j}\}$ for the evaluator’s input. Given this information, the
evaluator runs Eval to compute the circuit.


4.2.  Analysis 

To analyze the performance of component-based 2PC, we look separately at the
online and offline phases. In the offline phase, the garbling and transmission
of garbled components is similar to the total communication normally done to
garble and send a circuit. However, this communication can be done offline and
thus does not affect the online running time. The online phase, on the other
hand, only sends one link label per pair of wires connecting any components.
So, in total, the online communication necessary is just one label for each
component input wire (along with information on which input wires map to which
output wires). We note that, even in the case when components are just single
gates, this still enables us to achieve communication of one label per gate
(and XOR gates remain free). This is a 50% savings over the best known
construction [ZRE15] (again, discounting the metadata required to link these
wires together). In the more realistic case, where components are
substantially larger, the savings can be much greater.
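The arithmetic behind these savings can be sketched as follows. The label length of 128 bits and the example component size are assumptions chosen for illustration, and link metadata is ignored, as in the text.

```python
LABEL_BITS = 128  # assumed wire-label length


def half_gates_bits(and_gates: int) -> int:
    """Cost of sending the whole circuit online: two labels per AND gate."""
    return 2 * and_gates * LABEL_BITS


def link_label_bits(linked_wires: int) -> int:
    """Component-based online cost: one link label per linked wire."""
    return linked_wires * LABEL_BITS


def savings(and_gates: int, linked_wires: int) -> float:
    """Fraction of online communication saved relative to half-gates."""
    return 1 - link_label_bits(linked_wires) / half_gates_bits(and_gates)


# Single-gate components: one label per gate instead of two.
single_gate = savings(and_gates=100, linked_wires=100)

# A hypothetical larger component: 1000 AND gates behind 128 linked wires.
large_component = savings(and_gates=1000, linked_wires=128)
```

With single-gate components the ratio is one label against two, which gives the 50% figure; the larger the component is relative to its number of linked wires, the closer the savings get to 100%.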

This analysis assumes that the same circuit is used in both cases. In reality,
requiring circuits be built of pre-made components will change their
structure. Component-based construction limits global circuit optimization
because components must be treated as impermeable black boxes. However, doing
careful circuit optimization for each function is hard to begin with. It is an
expensive computation, and the time to optimize the circuit counteracts (or
even eliminates) the efficiency gains that optimization would provide. When
building a library of components, one can very carefully optimize each
component, since the optimization is an offline computation and is amortized
over many uses of each component. As a result, the feasible amount of circuit
optimization might be substantially higher with the component-based approach.

4.3.  Security 

We  now  sketch  a  proof  of  security  for  our  offline/online 
construction.  Roughly,  what  we  need  to  prove  is  that  the 
added  linking  labels  do  not  break  the  security  of  the  original 
garbled  circuit  construction.  More  formally,  we  need  to  show 
a  simulator  that,  given  the  output  y,  is  able  to  generate 
simulated  garbled  components  and  linking  labels  that  would 
look  indistinguishable  from  the  true  garbled  circuit. 

We  must  consider  the  view  of  each  party,  where  the 
“view”  includes  any  messages  received  during  the  protocol. 
(Values  computed  and  sent  by  a  party  themselves  cannot 
give  them  additional  information.)  First  we  note  that  the 
view  of  the  garbler  in  this  construction  only  consists  of 
its  side  of  the  OT  protocol  executions.  This  is  the  same 
as  its  view  in  the  standard  garbled  circuit  protocol,  so  no 
additional  security  argument  is  needed. 

Next we consider security against a semi-honest evaluator. Roughly, we can use
a slightly modified version of the standard garbled circuit simulator. This
simulator produces a garbled circuit $GC$ for the overall circuit $C$. The
simulator then divides this circuit into components matching the components
that were pre-garbled by the protocol. These garbled components are then
modified as follows. For each output wire $w_i$ of each linked component, a
random label $W_i$ is chosen and is XORed with the output wire label. The
result is a new label for each output wire. (The tables in the final gate
before each output wire are modified to match the new values.) The output
wires still have truly random labels, so these simulated values are still
indistinguishable from the evaluator’s true view. We now simply note that the
random values $W_i$ for each component output wire serve as the simulated
linking value that would connect each component’s output to the relevant input
wires of the next component. They have the same mathematical relationship to
the wire labels as the true linking values do. Therefore the simulator has
produced a complete simulation of the evaluator’s view, and security is
achieved.

5.  Implementation 

We  have  implemented  all  the  theoretical  ideas  discussed 
above  in  CompGC,  a  new  system  for  secure  computation 
with  preprocessing.  Here  we  describe  the  implementation 
in  detail,  and  in  the  next  section  we  present  performance 
numbers  from  our  experimental  results. 

CompGC uses as its primary building block the libgarble library, which is
based on the JustGarble implementation of Bellare et al. [BHKR13]. We chose to
use libgarble over existing approaches, such as TinyGarble [SHS+15], due to
its efficiency,¹ the fact that it can be compiled as a shared library, and
that it has a consistent API. The libgarble library does just what its name
implies: it creates a garbled version of a specified circuit and evaluates
that circuit given inputs. It is a tool, rather than a complete implementation
of secure computation. It does not carry out the oblivious transfers (OTs)
necessary to share input, or the networked interactions necessary to send the
garbled circuit (or the information for the OT protocols, or the output)
between parties.

The libgarble library is based on JustGarble, but several improvements have
been made to the code, including cleaning up the API, improving the structures
for storing the garbled circuit, etc. With these modifications, we can now
evaluate an AES circuit in around 17 cycles/gate, a computation that takes
around 22 cycles/gate on the same hardware with the original JustGarble
implementation, an improvement of around 22%. Note that, while implemented in
libgarble, we do not use the half-gates approach of Zahur et al. [ZRE15],
which reduces the size of each garbled gate to two labels at the cost of two
calls to the hash function $H$ during evaluation. We instead use a scheme
proposed by Bellare et al. [BHKR13] which requires three labels be transferred
but only one call to $H$ during evaluation. As we are only concerned with the
online time, the benefits of a smaller circuit are outweighed by the extra
cost in evaluation.

We  then  use  libgarble  to  build  CompGC.  CompGC  has 
both  an  offline  and  an  online  phase.  In  the  offline  phase, 
CompGC  is  given  a  library  of  components  and  computes  a 
specified  number  of  each  component.  This  library  could  be 
small  and  special-built  for  a  certain  class  of  functions,  or  it 
could  be  a  huge  library  of  many  common  computational 
steps,  meant  to  allow  faster  online  computation  of  most 
realistic  functions. 

1. Using libgarble as a building block, securely computing AES over localhost
using precomputed OTs takes 4.4ms (cf. Table 2), whereas TinyGarble using
their --disable-OT flag takes 13ms.


In the offline phase, the garbler side of CompGC uses libgarble to generate
and garble the component circuits. The garbler saves the garbled component
circuits, each tagged with a unique ID, and input and output labels to disk.
The garbler side also sends the garbled component circuits and their unique
IDs to the evaluator side, which saves the received data to disk. The offline
phase finishes by performing the offline portion of OT preprocessing as
described by Beaver [Bea95].
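For readers unfamiliar with OT preprocessing, the idea can be sketched as follows. This is a minimal illustration of the standard precomputed-OT derandomization, not CompGC's code: the offline random OT is simply dealt by a helper function here, whereas a real protocol would obtain it by running an actual OT protocol on random inputs. All names are illustrative.

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def offline_random_ot(n: int = 16):
    """Offline phase: sender gets random (r0, r1); receiver gets (c, r_c).

    Dealt directly for illustration; in a real run the sender never learns c
    and the receiver never learns the other pad.
    """
    r = (secrets.token_bytes(n), secrets.token_bytes(n))
    c = secrets.randbelow(2)
    return r, (c, r[c])


def receiver_request(b: int, receiver_state) -> int:
    """Online message 1: the receiver masks its real choice bit b with c."""
    c, _ = receiver_state
    return b ^ c


def sender_response(m0: bytes, m1: bytes, e: int, sender_state):
    """Online message 2: the sender masks (m0, m1) with the precomputed pads."""
    r = sender_state
    return xor(m0, r[e]), xor(m1, r[1 - e])


def receiver_output(b: int, response, receiver_state) -> bytes:
    """The receiver unmasks the one message it chose."""
    _, r_c = receiver_state
    return xor(response[b], r_c)


def transfer(m0: bytes, m1: bytes, b: int) -> bytes:
    """Run the offline and online steps end to end."""
    sender_state, receiver_state = offline_random_ot(len(m0))
    e = receiver_request(b, receiver_state)
    response = sender_response(m0, m1, e, sender_state)
    return receiver_output(b, response, receiver_state)
```

The online phase then costs only one bit from the receiver and two masked messages from the sender per transfer, with no public-key operations.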

We  specify  the  function  that  the  garbler  and  evaluator 
compute  in  the  online  phase  with  a  JSON  file.  The  file 
specifies  what  types  of  components  are  needed  for  the 
computation,  and  how  the  components’  input  and  output 
wires  should  be  connected.  (Another  format  could  be  used 
to  gain  a  small  efficiency  improvement,  but  we  value  the 
fact  that  the  JSON  file  is  human-readable.) 

The  garbler  receives  this  function  and  the  garbler’s  input 
to  the  function  at  the  beginning  of  the  online  phase.  It 
then  generates  a  set  of  instructions  for  the  evaluator.  The 
instructions  specify  particular  pre-shared  garbled  circuits  (by 
ID,  rather  than  just  by  type).  The  instructions  also  specify  an 
order  for  their  evaluation  and  specify  how  to  feed  the  outputs 
of  one  component  into  the  inputs  of  others.  (This  requires 
both  specifying  what  wires  connect  where  and  specifying 
the  relevant  mask  for  each  pair  of  wires  that  are  being 
connected.)  Finally,  the  instructions  include  the  necessary 
information  to  convert  the  output  wire  labels  to  bits,  as  well 
as  the  wire  labels  for  the  garbler’s  input.  The  garbler  sends 
these  instructions  to  the  evaluator. 
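To make the flow from function description to evaluator instructions concrete, here is a small sketch. The JSON layout, field names, component type names, and IDs below are all hypothetical; the actual CompGC file format is not reproduced here. The sketch assigns concrete pre-garbled component IDs by type and derives an evaluation order with a topological sort; the link labels themselves are omitted.

```python
import json
from collections import deque

# Hypothetical function description; every field name is illustrative.
SPEC_JSON = """
{
  "components": {"r0": "AES_ROUND", "r1": "AES_ROUND", "x0": "XOR"},
  "links": [
    {"from": "x0", "out": 0, "to": "r0", "in": 0},
    {"from": "r0", "out": 0, "to": "r1", "in": 0}
  ]
}
"""


def build_instructions(spec, pool):
    """Pick concrete pre-garbled IDs and an evaluation order for `spec`.

    pool maps a component type to a list of unused pre-garbled component IDs.
    """
    ids = {name: pool[ctype].pop(0)
           for name, ctype in spec["components"].items()}
    successors = {name: set() for name in spec["components"]}
    indegree = {name: 0 for name in spec["components"]}
    for link in spec["links"]:
        if link["to"] not in successors[link["from"]]:
            successors[link["from"]].add(link["to"])
            indegree[link["to"]] += 1
    ready = deque(sorted(n for n, deg in indegree.items() if deg == 0))
    order = []
    while ready:  # Kahn's topological sort
        name = ready.popleft()
        order.append(ids[name])
        for nxt in sorted(successors[name]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(ids):
        raise ValueError("component graph contains a cycle")
    links = [(ids[l["from"]], l["out"], ids[l["to"]], l["in"])
             for l in spec["links"]]
    return {"eval_order": order, "links": links}


spec = json.loads(SPEC_JSON)
pool = {"AES_ROUND": [17, 18, 19], "XOR": [40, 41]}
instructions = build_instructions(spec, pool)
```

Referring to components by ID rather than by type matters because each pre-garbled copy has its own labels and can be used only once.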

Next,  the  garbler  and  evaluator  perform  the  online  phase 
of  preprocessed  oblivious  transfer,  resulting  in  the  evaluator 
having  input  labels  corresponding  to  its  input.  The  evaluator 
now  has  all  of  the  information  necessary  to  perform  the 
computation.  It  evaluates  each  component  using  libgarble 
(in  an  order  specified  by  the  instructions),  and  computes  the 
input  labels  for  each  component  from  either  input  labels  or 
processing  the  output  of  a  previous  component.  Finally,  the 
evaluator  computes  the  final  output  (and  can  then  send  it 
back  to  the  garbler). 

6.  Experimental  Results 

In this section we describe two classes of experiments that we performed to
demonstrate the efficiency gains provided by CompGC. The first of these
demonstrates the performance improvements over traditional garbled circuit
computation for several useful classes of computation. The second of these
shows the efficiency of using CompGC to compute several machine learning
classification algorithms.

6.1.  Improvements  over  Naive  Garbled  Circuits 

We compared CompGC with the traditional setting where the entire circuit is
transferred online. We implemented a semi-honest protocol using libgarble in
which the parties preprocess OTs in an offline stage, but the circuit garbling
and transfer is done online. This is the closest setting to our work, as we
assume that the parties do not know which circuit they would like to compute
until the online stage.

Figure 1: Levenshtein core circuit (taken from Figure 5(c) of the work of
Huang et al. [HEKM11]).

Experimental setup. All experiments in this section were run on an Intel®
Core™ i5-4210H CPU @ 2.90GHz with 8 GB of RAM, and were conducted over two
network settings. We used the built-in Linux emulator netem to configure
localhost to have a latency of 33 ms (the average latency in the United
States) and a bandwidth of 50 Mbits/sec. We chose to use a simulated network
due to the ease of controlling the latency and bandwidth as well as the ease
of reproducibility.

We ran four experiments: AES, CBC mode, and Levenshtein distance using both 30
and 60 symbols. We discuss each experiment in turn.

AES: We treat each round of AES as a separate component. Thus, computing AES
involves linking together 10 components (for each of the 10 rounds of AES when
considering 128-bit inputs).

CBC mode: This algorithm provides a way of encrypting variable length messages
using a blockcipher (in our case, AES) as an underlying building block. We use
the same single-AES-round components as the above experiment, along with an
XOR component. Our experiment involves running CBC mode over a 10 block
message, and thus we use 110 components (100 for the AES rounds and 10 for the
XOR components).

Levenshtein distance: This algorithm provides a measure of distance between
two strings. We use as the core component the Levenshtein core circuit as
explained by Huang et al. [HEKM11]; see also Figure 1. We use an 8-bit
alphabet and run Levenshtein distance over strings containing both 30 and 60
symbols, which corresponds to 900 and 3600 components, respectively.

We note that these experiments are just a sample of what can be done using our
tool. While the components we use are particular to our experiments, we note
that, for example, an AES circuit could be used in other systems besides just
CBC mode (e.g., any function that uses a blockcipher). Likewise, we could
break the Levenshtein core circuit into its components (such as 2-MIN and
AddOneBit; see Figure 1), which can likely be used in other circuit
constructions.
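The component counts quoted for the four experiments follow from simple arithmetic, sketched below. Reading the Levenshtein count as one core component per cell of an n-by-n dynamic-programming table is an assumption; it is consistent with the figures of 900 and 3600 given above.

```python
AES_ROUNDS = 10  # rounds of AES with 128-bit inputs


def aes_components() -> int:
    """One component per AES round."""
    return AES_ROUNDS


def cbc_components(num_blocks: int) -> int:
    """Per block: one XOR component plus one component per AES round."""
    return num_blocks * (AES_ROUNDS + 1)


def levenshtein_components(num_symbols: int) -> int:
    """One core component per cell of the n-by-n table (assumed layout)."""
    return num_symbols * num_symbols


counts = {
    "AES": aes_components(),
    "CBC mode (10 blocks)": cbc_components(10),
    "Leven. (30)": levenshtein_components(30),
    "Leven. (60)": levenshtein_components(60),
}
```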

Experimental results. Table 2 presents the results of the above experiments
over our simulated network. We compare the running times of both standard
semi-honest secure two-party computation with the OTs preprocessed, which we
denote as Naive, and our component-based garbled circuit protocol, which we
denote as CompGC. We execute 100 runs of each experiment, reporting the
average and the 95% confidence interval. Looking at the running times on the
simulated network we see drastic improvements of upwards of an order of
magnitude for CBC mode and Levenshtein using 60 symbols, as well as
significant improvements for the other two cases. We can see why this is the
case by looking at the total communication of each approach; CompGC
demonstrates the greatest time improvement for those experiments with the
greatest communication improvement.
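The time and communication ratios can be recomputed directly from Table 2. The snippet below is only a convenience for doing so; the numbers are copied from the table (times in milliseconds, communication in megabits) and nothing else is assumed.

```python
# name: (naive_ms, compgc_ms, naive_mbit, compgc_mbit), copied from Table 2
TABLE_2 = {
    "AES": (115.7, 66.4, 2.7, 0.7),
    "CBC mode": (580.4, 141.6, 26.6, 7.4),
    "Leven. (30)": (1249.0, 137.7, 58.7, 10.0),
    "Leven. (60)": (5827.0, 384.2, 286.7, 43.8),
}


def ratios(row):
    """Return (time ratio, communication ratio) of Naive over CompGC."""
    naive_ms, compgc_ms, naive_mbit, compgc_mbit = row
    return naive_ms / compgc_ms, naive_mbit / compgc_mbit


def report():
    """Format one line per experiment with both ratios."""
    lines = []
    for name, row in TABLE_2.items():
        time_ratio, comm_ratio = ratios(row)
        lines.append(f"{name:12s} time x{time_ratio:.1f}  comm x{comm_ratio:.1f}")
    return lines
```

Calling `report()` returns one formatted line per experiment, which makes it easy to compare the time ratio against the communication ratio for each row.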

As the main use of CompGC is for more efficient online running time, we did
not optimize the offline time (we do not use OT extension and do not use a
highly optimized OT implementation). However, we note that our offline phase
is still relatively efficient: around 30ms for AES and around 450ms for CBC
mode and Levenshtein with 60 symbols, all over localhost.² Thus, we are not
achieving efficient online secure computation at the cost of an expensive
offline phase: the offline phase involves only preprocessing OTs and garbling
and sending garbled circuits.

From  these  experiments,  we  validate  the  belief  that 
communication  is  the  bottleneck  for  semi-honest  secure 
two-party  computation  using  garbled  circuits  on  realistic 
networks,  and  demonstrate  that  component-based  garbling 
provides  a  powerful  technique  for  reducing  this  bottleneck. 

6.2.  Machine  Learning  Classification 

Our  next  set  of  experiments  aims  to  examine  a  setting 
where  a  small  library  of  simple  components  can  enable  a 
variety  of  computations  in  a  particular  domain.  We  choose 
private  classification  because  it  gives  a  good  illustration  of 
this  sort  of  use,  and  because  it’s  an  area  that  others  have 
claimed  is  difficult  for  garbled  circuit-based  protocols  in  the 
past. 

The key prior work on private classification is that of Bost et al. They claim
that garbled circuit protocols cannot feasibly compute classification
functions (due to memory constraints) and then give specialized protocols that
can. We first show that standard “naive” garbled circuits can indeed compute
all three of the classification functions considered by Bost et al.³ The
efficiency is incomparable; two of the

2. As a comparison point, TinyGarble takes around 120ms for AES.

3.  The  reason  for  this  discrepancy  is  that  the  failure  Bost  et  al.  claim 
occurs  when  they  attempt  to  use  an  automated  tool  to  generate  the  circuits 
for  the  classification  functions.  We  simply  specify  the  circuits  by  hand.  We 
believe  this  is  the  more  fair  comparison,  since  specifying  the  circuits  is 
certainly  easier  than  writing  special-purpose  protocols. 


               Time (simulated)                  Comm.
               Naive            CompGC           Naive     CompGC
AES            115.7 ± 0.7      66.4 ± 0.7       2.7       0.7
CBC mode       580.4 ± 1.7      141.6 ± 0.3      26.6      7.4
Leven. (30)    1249.0 ± 1.0     137.7 ± 1.0      58.7      10.0
Leven. (60)    5827.0 ± 1.2     384.2 ± 0.4      286.7     43.8

TABLE  2:  Experimental  results;  see  Section  6  for  the  experimental  setup.  Leven.  (XX)  denotes  Levenshtein  distance  over 
strings  containing  XX  symbols.  All  times  are  in  milliseconds  and  all  communication  is  in  megabits.  Naive  denotes  our 
implementation  of  standard  semi-honest  2PC  using  garbled  circuits  and  preprocessed  OTs  using  libgarble,  whereas  CompGC 
denotes  our  component-based  implementation.  Time  is  (online)  computation  time,  not  including  the  time  to  preprocess  OTs, 
but  including  the  time  to  load  data  from  disk.  All  timings  are  of  the  evaluator’s  running  time,  and  are  the  average  of  100 
runs,  with  the  value  after  the  ±  denoting  the  95%  confidence  interval.  The  communication  reported  is  the  number  of  bits 
received  by  the  evaluator. 


three functions are significantly faster using the specialized protocols,
while the third is much faster using garbled circuits (cf. Table 3).

We then use our component-based approach and find, as before, that we can
substantially improve online efficiency compared to prior approaches. The
resulting speeds are all faster than those achieved by Bost et al., sometimes
by large margins. This shows that with precomputation, even without knowing
the classification function(s) that will be needed ahead of time, garbled
circuits can produce faster online times than even special-purpose protocols.
Furthermore, the components we use all carry out simple operations that we
expect would be useful for a large variety of functions, not only private
classification.

Experimental Setup. As before, all experiments are run on an Intel® Core™
i5-4210H CPU @ 2.90 GHz with 8 GB of RAM. For better comparison, we modify our
network settings here to better match those used by Bost et al. [BPTG15].
Specifically, we use two emulated networks. The first has 20 ms latency and no
bandwidth limitations, matching that of Bost et al. To demonstrate a “worst
case” for our setting, we also measure performance over a network with 20 ms
latency and a bounded bandwidth of 50 Mbits/sec.⁴

As in Bost et al., we ran experiments to evaluate performance of three
different machine learning classifiers: hyperplane decision, naive Bayes, and
decision tree classifiers. We briefly describe these classifiers below. These
classifiers cover a wide range of machine learning algorithms, because many
different machine learning algorithms output classifiers of the same type.
(For example, perceptron, least squares, and support vector machine algorithms
all output hyperplane decision classifiers.) We follow the presentation of
[BPTG15] (see, e.g., [BN07] for more details about the particular machine
learning algorithms we consider).

For all the classifiers, the user’s input $x = (x_1, \ldots, x_d) \in
\mathbb{R}^d$ is called a feature vector. Classifying an input $x$ according
to a model $w$ amounts to evaluating a function
$C_w : \mathbb{R}^d \to \{c_1, \ldots, c_k\}$ on $x$ to classify $x$ into
class $c_{k^*}$ for $k^* \in \{1, \ldots, k\}$. For ease of notation, we often
write $k^*$ instead of $c_{k^*}$.

4. We note that [BPTG15] does not impose a bandwidth limit on the network for
their experiments.

Hyperplane decision-based classifier: For this classifier the model consists
of $k$ vectors $\{w_1, \ldots, w_k\}$, each in $\mathbb{R}^d$, and the
classifier function is

$$k^* = \operatorname{argmax}_{i \in [k]} \langle w_i, x \rangle$$

where $\langle \cdot, \cdot \rangle$ represents an inner product. Commonly, as
will be the case in all of our experiments, the model consists of only a
single vector $w$ and, instead of argmax, the classifier determines whether
$\langle w, x \rangle > 0$.

Naive Bayes: For this classifier, the model consists of the probabilities that
each class $c_i$ occurs ($\{\Pr[C = c_i]\}_{i=1}^{k}$) and the conditional
probabilities that the $j$th element of $x$ is equal to some value $v$ when
$x$ is in class $c_i$ (i.e.,
$\{\{\{\Pr[X_j = v \mid C = c_i]\}_{v \in D_j}\}_{j=1}^{d}\}_{i=1}^{k}$, where
$D_j$ is the set of values the $j$th feature can take). The classifier aims to
classify an input $x$ into the class resulting in the highest posterior
probability:

$$k^* = \operatorname{argmax}_{i \in [k]} \Pr[C = c_i \mid X = x].$$

The naive Bayes model additionally assumes that the features of $x$ are
independent so that

$$\Pr[C = c_i, X_1 = x_1, \ldots, X_d = x_d] = \Pr[C = c_i] \prod_{j=1}^{d} \Pr[X_j = x_j \mid C = c_i].$$

Decision Tree: For this classifier, the model is given as a binary tree $T$
where each internal node specifies a partition rule on one feature of $x$ and
each leaf node corresponds to a class. Specifically, each internal node is
labelled with an index $i \in [d]$ and a value $w_i$, and the rule checks
whether $x_i < w_i$. Classification is done by traversing this tree according
to the partition rules and outputting the class corresponding to the leaf node
reached.
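As a plaintext reference for the three classifiers just described, the following sketch evaluates each one in the clear (no garbling). The data layouts, the 0-based class indices, and the convention that a decision tree goes left when $x_i < w_i$ are assumptions made for illustration.

```python
from math import prod


def inner(w, x):
    """Inner product <w, x>."""
    return sum(wi * xi for wi, xi in zip(w, x))


def hyperplane_classify(models, x):
    """k* = argmax_i <w_i, x>, returned as a 0-based index."""
    scores = [inner(w, x) for w in models]
    return max(range(len(models)), key=scores.__getitem__)


def hyperplane_sign(w, x):
    """Single-vector variant: does <w, x> exceed zero?"""
    return inner(w, x) > 0


def naive_bayes_classify(priors, cond, x):
    """priors[i] = Pr[C = c_i]; cond[i][j][v] = Pr[X_j = v | C = c_i].

    Maximizing the joint probability maximizes the posterior as well, since
    Pr[X = x] is the same for every class.
    """
    joint = [priors[i] * prod(cond[i][j][v] for j, v in enumerate(x))
             for i in range(len(priors))]
    return max(range(len(priors)), key=joint.__getitem__)


def decision_tree_classify(node, x):
    """node is ('leaf', label) or ('node', i, w_i, left, right).

    Assumed convention: go left when x[i] < w_i, right otherwise.
    """
    while node[0] == "node":
        _, i, w, left, right = node
        node = left if x[i] < w else right
    return node[1]
```

For instance, `decision_tree_classify(("node", 0, 5, ("leaf", "a"), ("leaf", "b")), (3,))` follows the left branch because 3 < 5.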

Classifier components. As already pointed out by Bost et al., all the
classifiers above can be assembled using only three basic components:
comparisons, inner products, and an argmax computation. This makes these
classifier computations ideal for the component-based approach as a small
collection of pre-garbled components is sufficient to evaluate all of these
classifiers without knowing which one will be desired at preprocessing time.
Below we show how circuits for these classifiers can be constructed from a
slightly larger set of components.

Figure 2: Hyperplane decision classifier circuit.

Figure 3: One feature hyperplane decision classifier circuit.

Figure 4: Naive Bayes classifier circuit.

Figure 2 shows how to build a hyperplane decision classifier out of inner
product and argmax components. If, as is the case in the data sets used by
Bost et al., the model only consists of a single vector $w$, an even simpler
variant of this classifier can be built using a single inner product component
and a component that compares to zero. This is shown in Figure 3.

Figure 4 shows how to build a naive Bayes classifier circuit using addition,
select, and argmax components. The select component is one that takes an index
and selects an element at that index from an array. We note that just as in
Bost et al., we work with the logarithms of probabilities, so that products of
probabilities can be computed as sums.
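The select, addition, and argmax structure can be mirrored in the clear as follows. This is a sketch only: the fixed-point scaling factor and the table layout are assumptions, chosen so that log-probabilities become integers of the kind a boolean circuit could add.

```python
import math

SCALE = 1 << 16  # fixed-point scaling factor (an assumption)


def to_fixed_log(p: float) -> int:
    """Scaled integer logarithm of a probability."""
    return round(math.log(p) * SCALE)


def naive_bayes_log_classify(log_priors, log_cond, x):
    """Select log-probabilities by feature value, add them, take the argmax.

    log_priors[i] and log_cond[i][j][v] hold scaled integer logarithms of
    Pr[C = c_i] and Pr[X_j = v | C = c_i].
    """
    scores = []
    for i, log_prior in enumerate(log_priors):
        selected = [log_cond[i][j][v] for j, v in enumerate(x)]  # select
        scores.append(log_prior + sum(selected))                 # addition
    return max(range(len(scores)), key=scores.__getitem__)       # argmax


# Example model: two classes, one binary feature (illustrative numbers).
log_priors = [to_fixed_log(0.5), to_fixed_log(0.5)]
log_cond = [
    [{0: to_fixed_log(0.9), 1: to_fixed_log(0.1)}],
    [{0: to_fixed_log(0.2), 1: to_fixed_log(0.8)}],
]
```

Because the logarithm is monotone, the class with the largest sum of logarithms is the class with the largest product of probabilities, up to the rounding introduced by the fixed-point encoding.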

We implement a decision tree classifier by having the circuit mimic the
structure of the decision tree. Figure 5 shows the circuit portion
corresponding to a single internal node of the decision tree. This just
consists of a single comparison component together with a few basic boolean
logic gates. It takes as input a bit indicating whether or not this node is on
the path of the decision tree that would be traversed for the given feature
vector. It outputs two bits that indicate the same thing for each of its
children. As the tree is evaluated, the traversed path is computed, and the
leaf node that is reached receives a 1 while the others receive 0s. Leaf nodes
then simply output their corresponding label if they receive 1 and a null
value otherwise.

Figure 5: Decision tree classifier node evaluation circuit.
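The node logic described above can be written out in the clear as a short sketch. It assumes that the left child is taken when $x_i < w_i$, that class labels are nonzero integers with 0 serving as the null value, and that leaf outputs are combined with a bitwise OR; none of these details are taken from the CompGC implementation.

```python
def node_gate(on_path: int, x_i: int, w_i: int):
    """Return the on-path bits (left, right) for a node's two children."""
    cmp_bit = 1 if x_i < w_i else 0              # comparison component
    return on_path & cmp_bit, on_path & (1 - cmp_bit)


def leaf_output(on_path: int, label: int) -> int:
    """A leaf emits its label if it is on the path, else the null value 0."""
    return label if on_path else 0


def evaluate_tree(node, x, on_path: int = 1) -> int:
    """Evaluate every node, as a circuit would, and combine the leaf outputs.

    node is ('leaf', label) or ('node', i, w_i, left, right). Exactly one
    leaf receives on_path = 1, so OR-ing all leaf outputs yields its label.
    """
    if node[0] == "leaf":
        return leaf_output(on_path, node[1])
    _, i, w, left, right = node
    left_bit, right_bit = node_gate(on_path, x[i], w)
    return evaluate_tree(left, x, left_bit) | evaluate_tree(right, x, right_bit)
```

Unlike a plaintext traversal, every node is evaluated regardless of the input, which is what lets the circuit hide the path taken.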

We stress that all the components used in these constructions are simple,
standard operations and could serve as building blocks for many additional
functionalities besides the ones discussed here. Thus, while here we consider
a more limited case where a component library is specially tailored for
computing private classification, a large (but still quite modest) library
would enable both private classification and a variety of other functions.


Experimental Results. Table 3 presents the results of the above experiments
for machine learning classification. We evaluate our secure classification
algorithms on the same


                     Naive                     CompGC                    Bost et al. [BPTG15]
Data Set   Size      Time    Time*   Comm.     Time    Time*   Comm.     Time    Comm.   Rounds
Cancer     30        299             71        56              0.7       204     0.3     3.5
Credit     47        402             122       65              1.1       217     0.3     3.5

(a) Hyperplane decision classifier. “Size” is the length of the model vector w.

           Specs.    Naive                     CompGC                    Bost et al. [BPTG15]
Data Set   C    F    Time    Time*   Comm.     Time    Time*   Comm.     Time    Comm.   Rounds
Cancer     2    9    247             67        99              58        479     0.6     7
Nursery    5    9    528             206       166             185       1415    1.2     21

(b) Naive Bayes classifier. “C” is the number of classes and “F” is the number of features.

           Specs.    Naive                     CompGC                    Bost et al. [BPTG15]
Data Set   N    D    Time    Time*   Comm.     Time    Time*   Comm.     Time    Comm.   Rounds
Nursery    4    4    40              0.3       40              0.01      2085    21.6    15
ECG        6    4    40              0.4       40              0.1       8816    29.1    22

(c) Decision tree classifier. “N” is the number of internal nodes in the tree and “D” is its depth.


TABLE 3: Experimental results; see Section 6.2 for the experimental setup.
Times are in milliseconds and communication is in megabits. Naive denotes our
implementation of standard semi-honest 2PC using garbled circuits and
preprocessed OTs using libgarble, whereas CompGC denotes our component-based
implementation. “Time” is (online) computation time, not including the time to
preprocess OTs, but including the time to load data from disk. “Time*” reports
the running time over a network with bounded bandwidth. All timings are of the
evaluator’s running time, and are the average of 100 runs. The communication
reported is the number of bits received by the evaluator. The time and
communication of Bost et al. are as reported in [BPTG15]. “Rounds” denotes the
number of round trips between parties. For CompGC, the number of rounds is one
throughout.


data sets used by Bost et al.,⁵ which come from the UCI machine learning
repository [BL13] and from [BFK+09]. We use simple, straightforward circuits;
our results could be improved by carefully optimizing the component circuits.

For all of the classifier classes (when not bounding
the network), our CompGC technique significantly
outperforms the prior work. For the hyperplane decision
and naive Bayes classifiers we are roughly 2× faster than
Bost et al., while for decision trees we are more than an
order of magnitude faster. The gap is smaller for the former
two classifiers because they make heavy use of arithmetic
operations such as inner product and addition, which work
well with the homomorphic encryption-based tools used by
Bost et al. The decision tree classifier, in contrast, uses
only comparison and boolean operations, making it ideal
for a garbled circuit-based approach.

When bounding the network, we see only a small change
in our performance. In all but the settings that require
large communication, performance is very similar to the
unbounded bandwidth case.⁶ The only setting that
performs worse is the naive Bayes classifier, due entirely to
the communication cost in our case. Thus, as bandwidth
increases we will approach the unbounded bandwidth running
time. We also note that the number of interactions required
in our approach is (often much) less than in the protocols of
Bost et al. Thus, as the latency increases, our approach
will likely beat the prior work, especially when considering
larger bandwidths than what we considered here.

5. We do not include the audiology naive Bayes data set due to a design
decision in our implementation that stores the circuits in memory rather
than pipelining them from disk; thus, we run out of memory when trying
to load the necessary garbled circuits. We stress that this is an artifact of
our implementation and not an inherent issue with our approach.

6. The bounded bandwidth case appears to perform better than the
unbounded case for the hyperplane decision classifier only because both
results fall within the margin of error of each other.

Looking at the performance of naive Yao, we see that
it is indeed slower than the protocols of Bost et al. for
every classifier except the decision tree. However, it is not
nearly as slow as suggested by Bost et al., leaving hope
that this gap could be reduced by more efficient circuit
constructions that minimize the number of non-XOR gates.
The slower performance of garbled circuits is due to the
huge increase in total communication needed to transmit
the garbled circuit, and is especially evident over a bounded
bandwidth network.
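The interplay between communication volume, bandwidth, and round trips discussed above can be illustrated with a simple additive cost model. The sketch below is not the measurement methodology used in our experiments. It assumes that online time is the sum of local computation, transfer time, and one round-trip delay per interaction. The names `Protocol`, `online_time_ms`, and `crossover_rtt_ms`, and all numeric values, are hypothetical and are not taken from Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    """Hypothetical online-phase cost profile of a 2PC protocol."""
    compute_ms: float   # local computation time, in milliseconds
    comm_mbit: float    # data received by the evaluator, in megabits
    round_trips: int    # number of round trips between the parties


def online_time_ms(p: Protocol, bandwidth_mbps: float, rtt_ms: float) -> float:
    """Additive estimate: computation + transfer + per-round latency."""
    transfer_ms = p.comm_mbit * 1000.0 / bandwidth_mbps
    return p.compute_ms + transfer_ms + p.round_trips * rtt_ms


def crossover_rtt_ms(few_rounds: Protocol, many_rounds: Protocol,
                     bandwidth_mbps: float) -> float:
    """Round-trip time above which the protocol with fewer rounds is faster."""
    round_gap = many_rounds.round_trips - few_rounds.round_trips
    if round_gap <= 0:
        raise ValueError("second protocol must use more round trips")
    # Difference in latency-independent cost (computation + transfer).
    fixed_gap = (online_time_ms(few_rounds, bandwidth_mbps, 0.0)
                 - online_time_ms(many_rounds, bandwidth_mbps, 0.0))
    return max(0.0, fixed_gap / round_gap)


# Illustrative values only: a one-round, communication-heavy protocol
# versus a many-round, communication-light protocol.
one_round = Protocol(compute_ms=50.0, comm_mbit=10.0, round_trips=1)
multi_round = Protocol(compute_ms=200.0, comm_mbit=1.0, round_trips=15)

for rtt in (5.0, 50.0):
    a = online_time_ms(one_round, bandwidth_mbps=20.0, rtt_ms=rtt)
    b = online_time_ms(multi_round, bandwidth_mbps=20.0, rtt_ms=rtt)
    print(f"rtt={rtt} ms: one-round={a} ms, multi-round={b} ms")
```

Under these assumed numbers, the one-round protocol loses at low latency because its transfer time dominates, and wins once the round-trip time exceeds the crossover point; raising the assumed bandwidth lowers that crossover.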

Thus, we view our results as showing that generic
techniques can in fact beat the performance of Bost et al., at least
when it comes to the online computation time. Additionally,
the improvement over the naive Yao running times
demonstrates the power of our CompGC approach, especially in
settings where the components can be preprocessed but the
exact algorithm to be run may only be known during the
online phase.

7.  Conclusion 

Our new technique, component-based garbled circuits,
allows precomputation to greatly reduce online computation
time for secure two-party computation. For the functions we
tested, the online computation time was substantially
reduced, often by an order of magnitude. This is done by
decreasing the amount of data that must be communicated during
the online phase. While in principle one could construct
functions for which our technique is unlikely to produce
more than 50% savings with any realistic set of precomputed
components, the benefit for realistic functions is much,
much greater.

We also show that this technique compares well to the
private classification algorithms of Bost et al. This means
that, with precomputation, general garbled circuit protocols
can compete with (and sometimes significantly outperform)
special-purpose protocols.

We have shown this in several cases where the general
type of function is known ahead of time, but the specifics
(e.g., input length) are not. We also show functions that can
benefit from general libraries geared towards cryptographic
or machine learning computations. In the case of private
classification, the components used are so general that they
would likely be found in any general-purpose library of
components, which suggests that such a library could
feasibly be designed to be both reasonably sized and very
useful.

We  work  only  in  the  two-party  and  semi-honest  settings, 
but  multi-party  and  malicious  settings  could  be  amenable  to 
a  similar  technique.  We  leave  the  task  of  designing  specific 
protocols  for  these  settings  as  future  work. 